Dual Specificity Protein Kinase Cell Division Cycle2-Like 1 (PDB: 5J1V) from Homo sapiens
Created by: John Wilding
The protein known as dual specificity protein kinase cell division cycle2-like 1
(CLK1) (PDB ID: 5J1V) is a dual specificity kinase found in Homo sapiens (1).
Primarily found in the nucleus, dual specificity kinases phosphorylate both
serine/threonine- and tyrosine-containing protein substrates via adenosine
triphosphate (ATP) hydrolysis, which produces adenosine diphosphate and a
phosphoprotein (1). CLK1 phosphorylates serine(S)- and
arginine(R)-rich proteins of the spliceosomal complex and is a crucial enzyme
in a network of regulatory mechanisms that enable SR-rich proteins to control
RNA splicing (1). In particular, CLK1 phosphorylates the serine residues in
SR-rich proteins that determine the pre-mRNA splicing sites of
microtubule-associated protein tau, which is implicated in Alzheimer’s disease
and Parkinson’s disease (2). Drug inhibitors of CLK1 could potentially slow
further neurodegeneration in Alzheimer’s disease and Parkinson’s disease
patients by preventing the tau microtubule tangles that cause neuronal cell
death (3). For CLK1, the pyrido[3,4-g]quinazoline derivative ZW290 (PQZ)
successfully inhibited the ATP binding site and allowed X-ray crystallography
of all three CLK1 chains: A, B, and C (3).
Besides PQZ, glycerol is another ligand present in the crystal structure (1).
Several online tools were used to determine CLK1’s features and to find
similar proteins based on CLK1’s primary and tertiary structures. The CLK1
protein has a molecular weight of 118513.27 Da and an isoelectric point of
6.75 (4). CLK1 was run through two programs, PSI-BLAST (sequence comparison)
and the Dali server (structure comparison), to find comparative proteins for
relevant structural and functional analysis.
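A molecular weight like the one quoted above can be approximated directly from an amino acid sequence by summing average residue masses and adding one water molecule for the free termini. The sketch below is only an illustration of that arithmetic, not the method of the tool cited in (4); the function name `molecular_weight` and the example peptides are hypothetical, and the residue masses are standard average values.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once per chain for the terminal H and OH


def molecular_weight(sequence: str) -> float:
    """Approximate the average mass (Da) of one unmodified polypeptide chain."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    # Hypothetical short peptide, not the CLK1 sequence.
    print(round(molecular_weight("SRSRSR"), 2))
```

For a multi-chain entry, the same function would be applied to each chain and the results summed; post-translational modifications and bound ligands are ignored in this sketch.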
The PSI-BLAST program compares a protein’s primary structure to other known
protein primary structures and creates a list of similar proteins based on
their amino acid sequences. Each compared protein is assigned an E value
relative to the protein of interest (5). Gaps in the alignment between the
protein of interest and other proteins are scored and contribute to each
protein’s E value. CLK1 itself was assigned an E value of 0; proteins with an
E value under 0.05 are considered significantly similar to the protein of
interest. Serine/threonine-protein kinase YMR216C (PDB ID: 1HOW, S/TPK) has an
E value of 5e-26 relative to CLK1, so S/TPK has a significantly similar
protein sequence to CLK1 (5). S/TPK is used later in this paper as CLK1’s
comparative protein.
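To make the E value less abstract, the sketch below uses the standard relation between a BLAST bit score and an E value, E = m · n · 2^(−S′), where m is the query length, n is the database length, and S′ is the bit score. The function names and the example lengths are hypothetical, and PSI-BLAST applies additional corrections (for example, effective search-space adjustments) that are not modeled here; the 0.05 cutoff is the one used in this paper.

```python
def expect_value(bit_score: float, query_length: int, database_length: int) -> float:
    """Expected number of chance alignments scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with m = query length, n = database length
    (both in residues) and S' = bit score. Edge corrections are ignored.
    """
    if query_length <= 0 or database_length <= 0:
        raise ValueError("lengths must be positive")
    return query_length * database_length * 2.0 ** (-bit_score)


def is_significant(e_value: float, threshold: float = 0.05) -> bool:
    """Apply the cutoff used in the text: E values under 0.05 count as similar."""
    return e_value < threshold


if __name__ == "__main__":
    # Hypothetical search space: a 500-residue query against 1e8 residues.
    for score in (20, 50, 120):
        e = expect_value(score, 500, 100_000_000)
        print(score, e, is_significant(e))
```

Because the bit score sits in the exponent, each additional bit halves the E value, which is why a strong hit such as the 5e-26 reported for S/TPK falls far below the cutoff.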
The Dali server compares protein tertiary structures and calculates a Z-score
based on differences in intramolecular distances (6). It uses the
“sum-of-pairs” method to score the similarities found between the protein of
interest and other proteins (6). The comparative protein S/TPK had a Z-score
of 34.0 (6). A Z-score above 2 implies that the comparative protein has folds,
and therefore a tertiary structure, similar to those of the protein of
interest (6).
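The idea behind a sum-of-pairs comparison of intramolecular distances can be sketched as follows. This is an illustrative approximation, not the Dali server's implementation: it assumes the residues of the two structures are already aligned one-to-one, and the constants `THETA` (0.2) and `ALPHA` (20 Å) follow my understanding of the published elastic similarity score and should be treated as assumptions. The conversion of a raw score into a Z-score against a background distribution is not reproduced here.

```python
import math

THETA = 0.2   # assumed tolerance: relative deviations above 20% are penalized
ALPHA = 20.0  # assumed decay length in angstroms; down-weights distant pairs


def distance_matrix(coords):
    """All pairwise distances within one structure (e.g. C-alpha atoms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def sum_of_pairs_score(coords_a, coords_b):
    """Sum a similarity term over every pair of aligned residues (i, j)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must be aligned to the same length")
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    n = len(coords_a)
    score = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                score += THETA  # a residue always matches itself
                continue
            mean = 0.5 * (da[i][j] + db[i][j])
            deviation = abs(da[i][j] - db[i][j]) / mean if mean > 0 else 0.0
            score += (THETA - deviation) * math.exp(-((mean / ALPHA) ** 2))
    return score


if __name__ == "__main__":
    # Hypothetical three-residue fragments; coordinates are made up.
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    stretched = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(sum_of_pairs_score(native, native))
    print(sum_of_pairs_score(native, stretched))
```

Pairs whose distances agree add to the score and pairs that disagree subtract from it, so two structures with the same fold accumulate a high total even when their sequences differ.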
CLK1 is a trimer consisting of three nearly identical monomers: subunits A, B,
and C (1). Subunits A and B both have a complete protein sequence; however,
residues 340-342 and 411-430 of subunit C are not resolved, and subunit C has a
slightly different structure from subunits A and B (1). Subunits A, B, and C
are held together as a trimer by several types of interaction: salt bridges,
hydrogen bonds, and non-bonded contacts (7).
Subunits B and C exhibit all three types of interaction. A salt bridge links
Lys-283 (B) to Glu-229 (C), and a hydrogen bond forms between Arg-469 of
subunit B and Tyr-331 of subunit C. Thirty non-bonded contacts occur between
four residues of subunit B and five residues of subunit C (7). Subunits A and B
exhibit only five non-bonded contacts, with His-335 of subunit A contacting
Lys-482 of subunit B and Lys-405 of subunit A contacting Asn-219 of subunit B
(7). Subunits A and C share two hydrogen bonds, both between Thr-338 of subunit
A and His-335 of subunit C, and sixteen non-bonded contacts between six
residues of subunit A and four residues of subunit C (7). There are no
disulfide bonds between the subunits (7).
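Counts of inter-subunit contacts such as those above come from measuring distances between atoms on different chains. A minimal sketch of that bookkeeping is shown below; it is not the procedure of the tool cited in (7). The function name `interface_contacts`, the 4.0 Å cutoff, and the toy coordinates are all assumptions made for illustration, and the atoms are generic placeholders rather than data from 5J1V.

```python
import math
from collections import Counter


def interface_contacts(atoms, cutoff=4.0):
    """Count atom pairs on different chains that lie within `cutoff` angstroms.

    `atoms` is a list of (chain_id, label, (x, y, z)) tuples. The result maps
    a sorted chain pair such as ("A", "B") to its number of close atom pairs.
    """
    counts = Counter()
    for i, (chain_i, _, xyz_i) in enumerate(atoms):
        for chain_j, _, xyz_j in atoms[i + 1:]:
            if chain_i != chain_j and math.dist(xyz_i, xyz_j) <= cutoff:
                counts[tuple(sorted((chain_i, chain_j)))] += 1
    return counts


if __name__ == "__main__":
    # Made-up coordinates: one close B-C pair, one close A-B pair, one far atom.
    toy_atoms = [
        ("B", "atom1", (0.0, 0.0, 0.0)),
        ("C", "atom2", (3.0, 0.0, 0.0)),
        ("A", "atom3", (20.0, 0.0, 0.0)),
        ("B", "atom4", (23.5, 0.0, 0.0)),
        ("C", "atom5", (50.0, 0.0, 0.0)),
    ]
    print(dict(interface_contacts(toy_atoms)))
```

A real analysis would read atom records from the structure file and would classify each close pair further (salt bridge, hydrogen bond, or non-bonded contact) using atom types and geometry, which this sketch does not attempt.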
CLK1 subunits A and B have the same secondary and tertiary structures. Each
has five beta sheets, six beta hairpins, six beta bulges, fourteen random
coils, seventeen alpha helices, twenty-two helix-helix interactions,
twenty-seven beta turns, and four gamma turns (7). Subunit C has four beta
sheets, five beta hairpins, six beta bulges, twelve random coils, thirteen
alpha helices, twenty-two helix-helix interactions, twenty-four beta turns, and
three gamma turns (7).
Each subunit has an ATP binding site and participates in phosphorylating
SR-rich proteins. The key residues are Lys-191 and Asp-288, which act as the
ATP binding site and a proton acceptor, respectively (1). In the crystal
structure, the ATP binding site at Lys-191 of each subunit is occupied by the
ligand pyrido[3,4-g]quinazoline (PQZ) (1, 3). In subunits A and B, PQZ forms
one hydrogen bond with the side-chain amine of Lys-191 and two hydrogen bonds
with Leu-244, one at the residue’s C-terminal end and another at its
N-terminal end (7). Van der Waals contacts with residues Leu-167, Val-175,
Ala-189, Phe-241, Glu-242, Leu-243, Leu-295, and Val-324 further hold PQZ in
the ATP binding groove (7). In subunit C, the hydrogen bonding residues are the
same: Lys-191 and Leu-244 still form three hydrogen bonds in total with PQZ.
However, the two residues are closer together on one side of PQZ, and the
ligand is not as “surrounded” by the binding residues (7). Van der Waals
contacts with Leu-167, Val-175, Ala-189, Phe-241, Glu-242, Leu-243, Leu-295,
and Val-324 are also present; however, subunit C’s tertiary structure differs
from that of subunits A and B, which leaves the binding groove exposed (7).
Besides PQZ, which occupies the ATP binding groove, glycerol is the other
ligand present. The glycerol molecules have no functional importance; they are
present because glycerol was the solvent used to prepare the CLK1-PQZ complex
for crystallization (3). Glycerol is bound to subunits A and C. On subunit A,
glycerol forms hydrogen bonds with the hydroxyl groups of Ser-384 and Tyr-411
(7). On subunit C, glycerol forms hydrogen bonds with the hydroxyl group from
Ala-183 and with one of the side-chain amines of Arg-186 (7). The glycerol
bound to subunit A sits in the pore between subunits A and C, whereas the
glycerol bound to subunit C is located between two beta sheets near the ATP
binding groove (7).
For each CLK1 subunit, the N-terminal lobe consists of three beta strands
followed by an alpha helix and two more beta strands (8). The C-terminal lobe
has an alpha helix at the bottom of the lobe that is made solvent-inaccessible
by a large insertion between residues 400 and 432, as seen in Figure 1 (8).
This region displays a helix-loop-two-strand beta sheet followed by an alpha
helix that is unique to CLK structures (8). The top of the C-terminal lobe also
has a unique insertion in which residues 300-317 form a beta hairpin (8).
The tertiary structure of S/TPK is similar to that of CLK1; however, S/TPK is a monomer rather than a trimer. Like CLK1, described above, and other protein kinases,
S/TPK has an N-terminal lobe made primarily of beta sheets and a C-terminal lobe made
of alpha helices, as seen in Figures 2 and 3. However, S/TPK has four non-kinase core segments (9). Residues
153 and 154 from the N-terminus form a short beta-strand (beta 0) that extends
the small lobe beta-sheet to six antiparallel strands (9). The strand beta 0 is
preceded by an eight-residue segment in an extended conformation that caps off
the small lobe. The C-terminus of S/TPK also extends beyond the kinase core and
wraps around the bottom of the small lobe before terminating near the
activation loop (9). This C-terminal extension plays a role in maintaining the
constitutive activity of S/TPK. In addition to extensions at both ends of the
kinase core, there are also two inserts within the core. The first is an
11-amino acid insert (the alpha C' insert) between helix alpha C and beta 4.
This segment includes a small loop that deviates from the small lobe, forms a
seven-amino acid helix (alpha C') that interacts with the helix alpha E in the
large lobe, and then rejoins the small lobe. This insert appears to help
stabilize the orientation of helix alpha C (9). The second insert is 47 amino
acids long and lies between helix alpha G and helix alpha H in the large lobe
(9). S/TPK is constitutively active, like CLK1, since both proteins
automatically bind and position ATP correctly for phosphorylation. Both
proteins consist of two lobes: the N-terminal lobe, with numerous beta sheets,
and the C-terminal lobe, with more alpha helices than beta sheets. In both CLK1
and S/TPK, alpha helices wrap around beta sheets to form the hollow that
accommodates ATP. The crucial differences in secondary and tertiary structure
were explained above; however, it is substrate specificity that separates
their functions. S/TPK specificity depends on the Glu-294 residue, as its
mutation to serine would stop all phosphorylation (9). CLK1 responds to the
presence of ATP through Glu-206 and Lys-191, which bind the substrate, and the
acidic A loop is stabilized by polar contacts during phosphorylation (8).
Additionally, the top of the CLK C-terminal lobe carries a long insertion
between strands β7 and β8. This insert forms the CLK-specific βhp-βhp′ hairpin
that folds over a shallow groove created by helices alpha D and alpha E, as
seen in Figure 1 (8).
CLK1 is a dual specificity protein kinase that is responsible for neuronal differentiation in Homo sapiens (8). Alternative RNA splicing is controlled by phosphorylation of serine- and arginine-rich splicing factors (8). CLK1 in particular targets serine- and arginine-rich substrates that have entered the nucleus, which are then phosphorylated at their serine-arginine-rich sites (8). The activated protein then moves on to splicing and is dephosphorylated (8). Splicing regulation allows a single gene to generate numerous protein isoforms (8). CLK1 efficacy is believed to be related to hereditary neuronal diseases, such as Alzheimer's disease (AD) and Parkinson's disease (3). Deficiencies in CLK1 dual specificity binding to specific serine-arginine-rich protein binding sites in neurons, caused by hereditary AD mutations, could lead to the formation of tau microtubule tangles, which induce neurodegeneration in AD patients (2, 3).